A REVIEW OF THE ISLSCP INITIATIVE I CD-ROM COLLECTION:
CONTEXT, SCOPE, AND MAIN OUTCOME
By Yann H. Kerr, CESBIO/LERTS
With contributions from Peter Briggs, Jim Collatz, Gerard Dedieu, Han Dolman,
John Gash, Forrest Hall, Alfredo Huete, Fred Huemmrich, John Janoviak, Randy
Koster, Sietse Los, James McManus, Blanche Meeson, Ken Mitchell, Michael
Raupach, Piers Sellers, Paul Try, Ivan Wright, and YongKang Xue.
CONTENTS
I. OVERVIEW
II. GENERAL OUTLINE OF THE REVIEW PROCESS
2.1 Stage One: Documentation Review
2.2 Stage Two: Qualitative Analysis of the CDs
2.3 Stage Three: "Hardware" Review of the CDs
2.4 Stage Four: Extensive and Quantitative Review of the CDs
III. QUALITATIVE REVIEW OF THE INITIATIVE I CD COLLECTION
3.1 Scope of the Review
3.2 Charge to the Reviewer
3.3 Organization of the Review
3.3.1 Data Types
3.3.2 Methodology
IV. OUTPUT OF THE REVIEW
4.1 Vegetation: Land Cover and Biophysics
4.2 Hydrology and Soils
4.2.1 Precipitation
4.2.2 Soils
4.2.3 Runoff
4.3 Snow, Ice, and Oceans
4.4 Radiation and Clouds
4.4.1 Radiation
4.4.2 Albedo
4.4.3 Clouds
4.5 Near-Surface Meteorology
V. CONCLUSION
I. OVERVIEW
A CD collection of global data sets has been issued within the framework of
ISLSCP Initiative I. The rationale for producing this CD set is described in
P.J. Sellers, et al. (Remote sensing of the land surface for studies of global
change: Models, algorithms, experiments. Remote Sens. Environ. 51(1):3-26). This
collection should be of considerable interest to land-atmosphere modelers
since data sets such as these are difficult to obtain in one package. However,
there are risks involved in releasing such a collection. Scientists may
consider its contents "gospel" (especially when the data come from another
scientific community) and may misuse the data or reject them as worthless or
grossly wrong, which could discredit the ISLSCP Initiative in its entirety.
Consequently, releasing the collection with insufficient explanation had to be
avoided.
Thus, the ISLSCP Science Steering Committee decided to review the different
data sets and include the results of the review (this text) on the CDs.
Because of time constraints, only a qualitative analysis, not a full review
process, was performed. In most cases the review consisted of looking at a
subsample of the different data sets (1 or 2 months, generally January and
July), identifying obvious problems, and suggesting corrections. Most of the
corrections were made but not reviewed. Real intercomparison of similar data
sets was not performed.
Generally speaking, the review showed that the data had the correct "look and
feel." All reviewers agreed that, despite some problems, these CDs were very
useful and almost always superior or equal to existing, though scattered and
often inaccessible, data sets.
As you will most probably use one or several data sets included in this
collection, you may come up with relevant comments and a more quantitative
analysis of the contents. Consequently, we welcome all your comments toward
producing a more quantitative statement of worthiness and an improved
Initiative II data set collection on CDs. These should be sent to the editors
of the CD collection (Blanche Meeson, Code 902.2, NASA Goddard Space Flight
Center, Greenbelt, MD 20771. Email: meeson@eosdata.gsfc.nasa.gov. Voice: 301-
286-9282).
II. GENERAL OUTLINE OF THE REVIEW PROCESS
Time constraints necessitated splitting the review process into four stages as
follows:
2.1 Stage One: Documentation Review
During the September 1993 ISLSCP Science Steering Committee (SSC) meeting, it
was decided to have the documentation reviewed separately to identify errors
and omissions of material essential for novice users. Tasks also included
flagging ranges of validity, main sources of errors, relevant literature, and
integrity of the documentation; reviewing for clarity, completeness, and data
comprehension; and checking the data format description and data acquisition
information. This review led to improved documentation files--this is actually
the main core of the CD review for the ground data sets and the data sets with
a long track record (e.g., Surface Radiation Budget (SRB) data sets).
2.2 Stage Two: Qualitative Analysis of the CDs
The role of this stage is detailed in Section III. The output is given in
Section IV.
2.3 Stage Three: "Hardware" Review of the CDs
This review consisted of a quick look at a test version of the CD set (issued
in limited number: under 15 copies). The reviewers were expected to check that
the CDs were readable, the data were organized correctly, and everything was
present and matched the original data sets they had reviewed as separate items
sent to them via e-mail or FTP. This was done.
2.4 Stage Four: Extensive and Quantitative Review of the CDs
This stage is for you, the user, to do. Please return your comments and
opinions to the editors of this CD collection (Blanche Meeson, Code 902.2,
NASA Goddard Space Flight Center, Greenbelt, MD 20771. Email:
meeson@eosdata.gsfc.nasa.gov. Voice: 301-286-9282). We ask that you
a) include relevant information you gathered while using the CDs by
comparing the data sets on the CDs with related data sets such as your own,
model output, and large-scale experiment results
b) suggest improvements, flag doubtful data, analyze the processing
steps, etc.--the data sets were intended to cover all of Earth's biomes; we
are sure that the quality of some of the products will vary with geographical
location and perhaps season.
This final review could culminate in a publication in the open literature and
maybe a workshop within a year or two of release of this CD collection. The
ISLSCP Science Steering Committee would also analyze the outcome of this
second review in light of Initiative II products and their ongoing program of
reviews.
III. QUALITATIVE REVIEW OF THE INITIATIVE I CD COLLECTION
3.1 Scope of the Review
The Initiative I goal was to produce CDs containing "available" state-of-the-
art data sets and products from reliable and readily available data. Toward
this purpose, reviewers were asked to check the validity and usefulness of the
data sets; identify caveats, doubtful parameters, and big mistakes; and assess
validity ranges, glaring gaps, and redundancies with other data sets. When
applicable, they also suggested improvements.
3.2 Charge to the Reviewer
"Perform a qualitative analysis (quantitative analysis whenever possible) of
the data sets from your knowledge of the discipline, cross comparisons with
other similar data sets, model output, etc."
For this purpose, the reviewer was expected to check the data sets falling
into his area of expertise and personal knowledge and answer the following
questions.
* In which area or biome did I make my comparison? Are the
data grossly wrong, or do they compare well with what I have seen or
measured?
* From my experience, how accurate (error bars) are the data?
* If differences are found, what are the possible explanations?
* What is the validity range of the data (i.e., range of physical
values, geographical areas, perturbing factors)?
* What are the caveats or limitations of the data?
* Are all the parameters relevant or useful?
* What other similar data sets exist?
* What are the temporal and spatial sampling characteristics?
Do they accurately reflect reality (representativity)?
Do they affect the usefulness of the data?
* For "processed" data, what is my opinion about the processing steps,
assumptions made, and impact on the output quality?
* Do I have any suggestions for improvements?
3.3 Organization of the Review
The first step is to distinguish between the different types of data sets,
since the review process might differ from one type to another.
3.3.1 Data Types
We identified three types of data sets: satellite data, ground data, and model
output. Merged data sets were considered under both applicable categories; for
example, a data set containing both ground and satellite data was reviewed
both as ground data and as satellite data.
a) Satellite data--Two subcategories:
* those that are new to the research community--the review process
concentrated on two topics:
1) analysis of the methodology used to process the data (identify
caveats, oversimplistic or wrong assumptions, etc.)
2) comparison of these products to other data sets (ground, model,
experiments)--the data sets in this subcategory were the most
important to review.
* those having a long track record, such as the Surface Radiation Budget
data sets--these were reviewed similarly to model output data sets.
b) Ground data--For the most part, ground data had to be taken as given. The
review focused on the known limitations of measurement techniques, sampling
(temporal and spatial), representativity, and accuracy. Where ground data were
produced after some processing steps, the reviewers were asked to give their
opinion about the procedures used.
c) Model output--The main scope of the review was to identify questionable
model products, range or area of validity, usefulness or relevance of the
different parameters, comparison with ground data and satellite data, accuracy
and reliability, main limitations, and known problems. The problem with model
output products is that they are usually self-consistent and have the look and
feel of actual data. Novice users tend to consider these data as "truth,"
whereas specialists are aware of the limitations.
3.3.2 Methodology
The first step (Stage One) was to send the documentation to the document
reviewers for a thorough review of content and accuracy. When the document
reviewers' comments were received and incorporated, the data and documentation
were sent to the Stage Two reviewers.
Stage Two reviewers were sent the documentation along with reviewing
instructions via electronic mail. These documents introduced the data sets
that were to be reviewed and delineated the scope, charges, and schedule of
the review. The reviewers were then sent the data sets via FTP from Goddard
Space Flight Center (GSFC).
When the data had been reviewed, a meeting was held at GSFC (October 26-27,
1994) during which the data sets were analyzed qualitatively using a couple of
samples from each. The outcome of this review was a whole set of suggested
improvements and a harmonization of notations. In several cases, alternative
data sets were suggested, and some data sets that were unavailable or of too
poor quality were replaced with others. A second meeting
took place January 4, 1995, at GSFC. During this meeting the added data sets
and the corrections made to the first-round data sets were checked. The final
output of the review process is given in Section IV.
Once all the data sets were completed, test CDs were produced and checked to
ensure that the data were properly encoded on the CDs (Stage Three, March
1995).
IV. OUTPUT OF THE REVIEW
4.1 Vegetation: Land Cover and Biophysics
The satellite data were divided into two categories: data sets having a long
track record (see 4.4) and "new products." Both types show caveats, but it was
considered that they needed pointing out only in the latter case. The
vegetation land cover and biophysics data set was analyzed as a satellite data
set in the category "new to the community" (cf. 3.3.1). It is the most
challenging data set to review and one of the most interesting on the CD set
thanks to the global coverage of several parameters of interest for the
modeling community. The user should be well aware, however, of the limitations
of this suite of parameters. The limitations are linked mainly to the
following facts.
a) Nearly the whole data set is obtained from Normalized Difference
Vegetation Index (NDVI) data. The input data consist of NDVI (2 years), a
vegetation map derived from NDVI data, and Earth Radiation Budget Experiment
(ERBE) data over the lower latitudes. Consequently, we have only two really
independent data sets in some areas and one in others, with added specific
information (respiration, C3/C4 etc.; see VEG_CLSS.DOC in the Documents
folder on the CD).
b) Any mistake or error in, say, the vegetation map will consequently
propagate in all related output files: check closely the validity over your
area of interest (Scotland seems to be covered with forest, for instance).
c) NDVI is used with all the limitations of that quantity. No atmospheric
corrections were done, but many empirical procedures were used to suppress
problems linked with cloud cover.
This leads to constant values over rainforest, for example, throughout
the year (one value is retained as good per pixel and kept for the whole
year). Consequently, there is a "jump" (not necessarily significant, though)
on December 31.
The Fourier transform tends to smooth the curve and suppress anomalies
during vegetation growth (a decrease during the growing season due to a
drought for instance) or smooth out or suppress short term evolution (i.e.,
semiarid fallow).
Sun angle correction is performed in a crude way (no relevant
information available). It might cause problems around the equinoxes and along
the scan.
In one direct comparison of these data with a higher-resolution data
set gathered over the FIFE site, the NDVI product values appeared lower
than expected in the middle of the growing season.
d) The relationship used to extract the Fraction of Photosynthetically Active
Radiation (FPAR) from the Simple Ratio (SR) was established over the Konza
prairie. For many biomes, however, shadowing effects lead to a much less
linear curve; consequently, the obtained FPAR is sometimes largely
underestimated.
e) To define background reflectances, the ERBE data are pasted in
areas of sparse or no vegetation, and the boundary is visible in some places
(central Europe), as the values differ (largely, in this case) from those
obtained by assigning values by vegetation type as derived from analyses of
NDVI.
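The Fourier-adjustment smoothing described in point (c) can be illustrated on a toy annual time series. This is a hedged sketch only: the harmonic count and the NumPy-based approach below are assumptions for illustration, not the algorithm actually used to produce the data set.

```python
import numpy as np

def fourier_smooth(series, n_harmonics=3):
    """Smooth a periodic (e.g., annual NDVI) series by keeping only the
    mean and the first few Fourier harmonics; illustrative only."""
    coeffs = np.fft.rfft(series)
    coeffs[n_harmonics + 1:] = 0.0        # drop high-frequency terms
    return np.fft.irfft(coeffs, n=len(series))

# A synthetic 12-month curve with a one-month dip (a drought-like
# anomaly): the smoothing attenuates the dip, as described in the text.
months = np.arange(12)
curve = np.sin(2 * np.pi * months / 12) ** 2
curve[7] -= 0.4                           # short anomaly in month 7
smoothed = fourier_smooth(curve, n_harmonics=2)
```

The anomaly survives only partially in the smoothed curve, which mirrors the concern above: genuine short-term events (a growing-season drought, semiarid fallow) can be suppressed along with noise.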
Thus, for Initiative II it is strongly recommended that a more suitable input
data set be used (actual reflectances and information on viewing and solar
angles, so that artificial cleanup methods are reduced to a minimum). Basic
atmospheric corrections could then be done with water vapor from the European
Center for Medium-Range Weather Forecast (ECMWF) or similar data. SR-FPAR
relationships should be more thoroughly tested. It was also suggested to use
Bidirectional Reflectance Distribution Function (BRDF) models.
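As background for the SR-FPAR testing recommended above, the standard quantities involved can be sketched as follows. The linear mapping and its endpoint values here are illustrative assumptions only, not the Konza-derived coefficients actually used for this product.

```python
import numpy as np

def ndvi(nir, red):
    """Standard Normalized Difference Vegetation Index."""
    return (nir - red) / (nir + red)

def simple_ratio(n):
    """Simple Ratio (SR) expressed in terms of NDVI."""
    return (1.0 + n) / (1.0 - n)

def fpar_linear(sr, sr_min=1.1, sr_max=12.0, fpar_max=0.95):
    """Illustrative linear SR-to-FPAR mapping. The endpoints are
    hypothetical; real SR-FPAR curves are less linear for many biomes
    (shadowing effects), as noted in the review."""
    f = fpar_max * (sr - sr_min) / (sr_max - sr_min)
    return float(np.clip(f, 0.0, fpar_max))

v = ndvi(nir=0.45, red=0.08)          # dense canopy: high NDVI
f = fpar_linear(simple_ratio(v))
```

A nonlinear SR-FPAR curve would replace `fpar_linear` biome by biome, which is essentially what point (d) argues is needed to avoid underestimating FPAR.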
4.2 Hydrology and Soils
The data sets in this category were analyzed as "ground data" and "merged data
sets." Generally speaking, ground data have been the most difficult to gather.
After the review process it was decided to drop several data sets originally
considered for inclusion in this CD collection because they were too
unreliable or the coverage of land surfaces was too sparse to be of any use
for global modeling. Those global ground data sets that appear on the CDs were
always judged as useful or very useful, in spite of a sometimes questionable
accuracy. They are the only source of global, uniform data accessible without
the usual hassle.
4.2.1 Precipitation
The monthly precipitation data set consists of data derived from analyses of
surface gauge observations. The rainfall data set is the state of the art but
might vary in quality with geographical location. This is due mainly to the
spatial coverage available (some areas have a very sparse gauge coverage), and
to the more basic problem of temporal sampling and representativity of values
derived over a 1*1 degree grid from few, not regularly spaced ground
measurements.
The representativity is fair temporally but somewhat poor spatially, as one
would expect. When compared with other (field campaign) measurements in the
Sahel and Brazil, some discrepancies, sometimes significant, were found. This
is probably due to spatial representativity. Users should be aware of these
possible variations and should check, over a given area of interest, whether
the number of stations used over the 1*1 degree area is sufficient to give
credible results. The CD collection also holds a merged monthly satellite-
surface precipitation product at 2.5*2.5 degree resolution: this is continuous
over the land and oceans and is provided only as a browse file.
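The station-density check suggested above can be illustrated with a deliberately naive gridding routine. This is a sketch only: the actual product was produced by a proper objective analysis of gauge data, not by the simple cell average shown here.

```python
import numpy as np

def grid_gauges(lats, lons, values, res=1.0):
    """Naive gridding for illustration: average all gauges falling in
    each res x res degree cell and report the station count per cell,
    which is the number a user would inspect over an area of interest."""
    cells = {}
    for lat, lon, v in zip(lats, lons, values):
        key = (int(np.floor(lat / res)), int(np.floor(lon / res)))
        cells.setdefault(key, []).append(v)
    return {k: (float(np.mean(vs)), len(vs)) for k, vs in cells.items()}

# Three hypothetical gauges: two fall in one 1x1 degree cell, one in another.
result = grid_gauges([10.2, 10.7, 11.3], [0.5, 0.9, 0.1],
                     [30.0, 50.0, 80.0])
```

A cell represented by a single gauge (here the second cell) is exactly the situation the review warns about: the gridded value may not be spatially representative.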
A hybrid precipitation product was generated by using the NMC GCM analysis
output and data from a large-scale observational program (GARP) to divide up
the GPCP 1 degree monthly data set described above into 6-hourly total and
convective precipitation amounts, which can then be used in conjunction with
the ECMWF 6-hourly products. The accuracy of this hybrid product is unknown.
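The general idea of the hybrid product (using the model's 6-hourly temporal pattern to split an observed monthly total) can be sketched as follows. This is an illustration under assumed inputs, not the actual NMC/GARP procedure, whose details and accuracy are, as stated above, unknown.

```python
import numpy as np

def disaggregate(monthly_total, model_6hourly):
    """Split an observed monthly precipitation total into 6-hourly
    amounts proportionally to a model 6-hourly precipitation series.
    If the model is completely dry, spread the total uniformly."""
    model = np.asarray(model_6hourly, dtype=float)
    if model.sum() <= 0:
        return np.full_like(model, monthly_total / len(model))
    return monthly_total * model / model.sum()

obs = 120.0                               # observed monthly total (mm)
pattern = [0.0, 4.0, 1.0, 0.0, 3.0, 2.0]  # toy 6-hourly model series
split = disaggregate(obs, pattern)
```

By construction the 6-hourly amounts sum back to the observed monthly total, so the scheme preserves the gauge-based analysis while borrowing only the timing from the model.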
4.2.2 Soils
The soil data set was put together from a variety of existing sources. It
contains some information on soil composition, texture, depth, and slopes. It
must be noted that these data sets are to be considered as is. They are not as
accurate as desired, and the information content might not satisfy all users.
It is, however, the state of the art, and it is thought that better
information cannot be obtained on a global basis at this time. To quote a
reviewer, "There is no new information on the CDs, just a concatenation of
existing data sets. So it is clearly a case of rubbish in, rubbish out." The
advantage is that on a CD the presentation stops at the right point, that is,
at the jumping-off place beyond which the qualified expert would not dare to
go. The data on
the CD are considered as better globally than other existing data sets.
However, locally (Amazon Basin, in this case) it is only equivalent to
existing data sets because of the poor source of data. Over the Amazon Basin,
it was found that the texture data are fairly accurate, but when they were
used to infer available soil moisture they proved to be questionable. This
probably applies to all areas of specific soils not well parameterized.
It must be noted that the slopes seem too high and that there is apparently a
problem over Greenland where the slopes are greater than over the Great
Cascade in Alaska. This is probably due to the way the slopes are computed
from a data set containing only three ranges. For Initiative II, the slopes
will probably have to be directly estimated from a digital elevation model.
For similar reasons, the soil type data set has limitations linked to the
input data.
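The direct DEM-based slope estimation recommended above for Initiative II could look like the following minimal sketch (central differences on a regular grid). It is an illustration of the idea only, not a prescribed method, and assumes a DEM already projected onto a uniform grid.

```python
import numpy as np

def slope_degrees(dem, cell_size):
    """Slope in degrees from a digital elevation model via finite
    differences. dem is a 2-D array of elevations (m); cell_size is
    the grid spacing (m), assumed equal in both directions."""
    dz_dy, dz_dx = np.gradient(dem.astype(float), cell_size)
    return np.degrees(np.arctan(np.hypot(dz_dx, dz_dy)))

# A ramp rising 100 m per 100 m cell in x gives a uniform 45-degree slope.
dem = np.tile(np.arange(0.0, 500.0, 100.0), (5, 1))
s = slope_degrees(dem, 100.0)
```

Computing slope directly from elevations avoids the artifact described above, where slopes inferred from only three coarse ranges produced implausible values over Greenland.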
4.2.3 Runoff
The runoff data set also suffers from several gaps. From a total of 34 basins,
only 14 are available for both 1987 and 1988. The consistency of the flow
rates is not very good. This data set should be used for checks since the
coverage is not global and not fully reliable. It should not be used as input
data. The efforts for Initiative II will probably have to concentrate on
improving these ground data sets.
4.3 Snow, Ice, and Oceans
These data sets are to be taken as is and considered with much care. One
should first note that the NOAA/NESDIS snow extent data set covers only
the northern hemisphere. Some doubtful results were also found in the USAF
ETAC snow depth data over Greenland (very high snow depth), and the Snow Cover
Data Set has some isolated anomalies, e.g., New Zealand (snow in January!).
Some problems were also found while regridding the polar stereographic
projection to the standard grid used on the CD.
The ocean data sets were not reviewed.
4.4 Radiation and Clouds
These data sets were analyzed as "satellite data with a long track record."
Users should refer to the documentation file for possible caveats, terminator
effects, and so on.
4.4.1 Radiation
There are several data sets of interest in this category: the ECMWF data (see
4.5), ERBE data, Staylor and Darnell (Langley Research Center), and Pinker's.
The main problem encountered was satellite coverage that did not cover the
complete globe. Gaps were filled through an interpolation method (Pinker)
after the first review. Nevertheless, discrepancies occur at the limits of the
coverage of the different geostationary satellites. It was also found that the
radiation values were compatible with climatological values, with differences
of between 10 and 20 percent in some cases, which can be attributed to
sampling and interpolation problems in the climatological data sets. In the
Sahel area, the ISLSCP radiation values captured the seasonal variability well
(+/- 20 deg. W), while in the Amazon Basin, only longwave down and net
longwave agreed with ground measurements. The shortwave down appeared bad, and
the shortwave net and total net were not very good. Moreover, the seasonality
of the signal found on the ISLSCP data set (+/- 70 deg. W) is not visible on
ground measurements. A registration error in the second part of 1987 was
detected. Globally, the radiation data seem reasonably accurate, with some
local problems that are largely offset by the available global coverage. For
Initiative II it was recommended to improve the aggregation technique. In
addition, it was recommended that the authors be less vague on the description
of their procedures in the documentation file.
4.4.2 Albedo
There seem to be some registration errors in the ERBE data set (5 deg. W), at
least for some months. Significant differences were also found between the
ERBE Top of Atmosphere (TOA) albedos and the Langley surface values (higher
over the oceans and lower over the land), but it was also found that the ERBE
albedo was slightly too high (over the Sahel and Amazon). It is recommended
that the documentation file clearly describe the differences between TOA and
surface albedo so that the less experienced user understands the distinction.
4.4.3 Clouds (and Atmospheric Data)
This data set (International Satellite Cloud and Climatology Project) was put
on CD as is. It has several problems due mainly to the different algorithms
used over sea and land (the continent contours are visible!) and to the
imperfect intercalibration of the different sensors or gap filling procedure
(vertical structure west of the Indian subcontinent linked to METEOSAT and GMS
coverage). Finally, the values at the extreme latitudes seem erroneous (cloud
water for land looks strange). This data set has to be used with much care.
Some reviewers suggested discarding the cloud optical thickness and cloud
water path, but they are included here because others thought them essential.
4.5 Near-Surface Meteorology
The review produced very little output for this data set. The problems were:
a) It is very difficult to check, and there was really only one data set
available at the beginning of this CD initiative (ECMWF). Model outputs are of
the self-consistent type: the model runs with various assumptions (sometimes
gross), so only part of the output consists of directly useful or checkable
products. Consequently, some output data do not make much sense. The ISLSCP SSC
and review team did some "pruning" of seemingly worthless data and decided to
elaborate new products of use in modeling (see the documentation files) from
existing data.
b) The data set arrived late, proved difficult to process, and
contained a large volume of data (four of the five CDs). Thus, we did not
have much opportunity to go through it. Consequently, the data sets are to be
considered state of the art, to be taken as is but not necessarily as
gospel. The user is strongly encouraged to read the documentation file
carefully and, if not a "trained user," to ask a modeler in case of doubt.
V. CONCLUSION
The CD collection review process has been an interesting and valuable
experience. We believe that it has enabled a significant improvement of the
content. Our only regret is that the time constraints were too tight to allow
the reviewers to go as deep into the analysis as they would have liked.
We believe, nevertheless, that users will provide us with their comments so
that a more complete review will eventually emerge, and the Initiative II CD
collection will benefit from user feedback and a more in-depth review.
Finally, the reviewers are deeply indebted to Blanche Meeson and James McManus
who, with very short notice, made this review possible in spite of various and
complex problems they had to solve to put together these data sets and the
reviewers' "suggested" changes.